Tracking in Reinforcement Learning
Authors
Abstract
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the environment of the learning agent can be considered stationary, generalized policy iteration frameworks interleave learning and control, which makes the evaluated policy, and therefore its value function, non-stationary. Tracking the optimal solution instead of trying to converge to it is therefore preferable. In this paper, we propose to handle this tracking issue with a Kalman-based temporal difference framework. Its complexity and convergence are analyzed. Finally, an empirical investigation of its ability to handle non-stationarity is provided.
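To make the idea of tracking concrete, the following is a minimal sketch of a Kalman filter applied to the weights of a linear value function. It is not taken from the paper: the class name `LinearKalmanTD`, the random-walk evolution model, the noise variances, and the single-state demo are all assumptions chosen for illustration. The weights are treated as a hidden state that may drift, and each observed reward is modeled as `(phi(s) - gamma * phi(s')) . theta` plus noise.

```python
class LinearKalmanTD:
    """Kalman filter over the weights theta of V(s) = phi(s) . theta (illustrative sketch)."""

    def __init__(self, n_features, gamma, process_noise=1e-2, obs_noise=1.0, prior_var=1.0):
        self.gamma = gamma
        self.q = process_noise  # assumed variance of the random walk on theta
        self.r = obs_noise      # assumed variance of the reward noise
        self.theta = [0.0] * n_features
        # Covariance of the weight estimate, initialised to prior_var * identity.
        self.P = [[prior_var if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]

    def value(self, phi):
        return sum(p * t for p, t in zip(phi, self.theta))

    def update(self, phi, reward, phi_next):
        n = len(self.theta)
        # Prediction step: under a random walk the mean is unchanged, uncertainty grows.
        for i in range(n):
            self.P[i][i] += self.q
        # Observation model: reward = (phi - gamma * phi_next) . theta + noise.
        h = [a - self.gamma * b for a, b in zip(phi, phi_next)]
        Ph = [sum(self.P[i][j] * h[j] for j in range(n)) for i in range(n)]
        innovation_var = sum(h[i] * Ph[i] for i in range(n)) + self.r
        gain = [x / innovation_var for x in Ph]
        # The innovation equals the temporal difference error r + gamma*V(s') - V(s).
        td_error = reward - sum(hi * ti for hi, ti in zip(h, self.theta))
        # Correction step: move theta along the gain and shrink the covariance.
        self.theta = [t + k * td_error for t, k in zip(self.theta, gain)]
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= gain[i] * Ph[j]
        return td_error


def demo():
    """Single self-looping state whose reward changes from 1 to 3 halfway through."""
    agent = LinearKalmanTD(n_features=1, gamma=0.5)
    phi = [1.0]
    for _ in range(300):
        agent.update(phi, 1.0, phi)
    before = agent.value(phi)
    for _ in range(300):
        agent.update(phi, 3.0, phi)
    after = agent.value(phi)
    return before, after
```

In the demo the true value is `reward / (1 - gamma)`, so the estimate should settle near 2 and then move toward 6 after the reward changes. The nonzero `process_noise` is what keeps the gain from decaying to zero; setting it to 0 would make the filter converge and then adapt to the change far more slowly.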
Similar resources
A Novel Dynamic Target Tracking Algorithm for Image Based on Two-step Reinforcement Learning
In this article, we model image target tracking within a reinforcement learning framework and propose a two-step reinforcement learning algorithm for target tracking. In this algorithm, we set multiple tracker agents to track the pixels of the target; the aim of reinforcement learning is to obtain the tracking strategy of every tracker agent. We divided each learning step of tracker into two pa...
Repetitive Tracking Control of Nonlinear Systems Using Reinforcement Fuzzy-Neural Adaptive Iterative Learning Controller
This paper proposes a new fuzzy neural network based reinforcement adaptive iterative learning controller for a class of nonlinear systems. Different from some existing reinforcement learning schemes, the reinforcement adaptive iterative learning controller has the advantages of rigorous proofs without using an approximation of the plant Jacobian. The critic is appended into the reinforcement a...
Eye-Tracking Method’ Usage for Understanding the Cognitive Processes in Multimedia Learning
Introduction: The design of multimedia learning environments should draw on evidence-based studies and principles of the human learning process. Eye tracking is a method based on the learner's processing of the learning materials presented in multimedia learning environments. The aim of the study was to examine the use of the eye-tracking method to investigate the cognitive processes in m...
Learning optimal switching policies for path tracking tasks on a mobile robot
A set of impedance controllers is used for both state estimation and tracking control on a mobile robot. State estimation is based on the states of a family of impedance controllers and tracking is implemented through a single controller from this set. Reinforcement learning techniques are used to create switching policies that optimize time or energy in a path tracking task.
Extrinsic Evaluation of Dialog State Tracking and Predictive Metrics for Dialog Policy Optimization
During the recent Dialog State Tracking Challenge (DSTC), a fundamental question was raised: “Would better performance in dialog state tracking translate to better performance of the optimized policy by reinforcement learning?” Also, during the challenge system evaluation, another nontrivial question arose: “Which evaluation metric and schedule would best predict improvement in overall dialog p...
Journal title:
Volume, issue
Pages -
Publication date: 2009